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Abstract. This paper focuses on polynomial dynamical systems over 
finite fields. These systems appear in a variety of contexts, in computer 
science, engineering, and computational biology, for instance as models 
of intracellular biochemical networks. It is shown that several problems 
relating to their structure and dynamics, as well as control theory, can 
be formulated and solved in the language of algebraic geometry. 



1. Introduction 

The study of the dynamics of polynomial maps as time-discrete dynamical 
systems has a long tradition, in particular polynomials over the complex 
numbers leading to fractal structures. An example of more recent work 
using techniques from algebraic geometry includes the study of the algebraic 
and topological entropy of iterates of monomial mappings / = (/i, • • • , /n) 
over subsets of C n [10j]. The study of iterates of polynomial mappings (or, 
more generally, rational maps) over the p-adic numbers originally arose in 



Diophantine geometry For more recent work see, e.g., 271 ] . Finally, 
there is a long tradition of studying the iterates of polynomial maps / : 
F q — > F q over finite fields, primarily usin g te chniques from combinatorics 
and algebraic number theory (see, e.g., [22II23I . 
The general case of finite dynamical systems 

f = (/l) • • • ) fn) '■ — > F™, 

with fi € Fgjxi, . . . , x n ] has long been of interest in the special case q = 2, 
which includes Boolean networks and cellular automata. They are of consid- 
erable interest in engineering and computer science, for instance. Since the 
1960s they are also being used increasingly as models of diverse biological 
and social networks. 

The underlying mathematical questions in these studies for different fields 
are similar. They relate primarily to understanding the long-term behavior 
of the iterates of the mapping. In the case of monomial mappings, the 
matrix of exponents is usually the right mathematical object to analyze by 
using different methods based on the ground field. Generally, one would like 
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to be able to use some feature of the structure of the coordinate functions 
fi to deduce properties of the structure of the phase space of the system / 
which represents the system's dynamics. For finite systems, i.e., polynomial 
dynamical systems / over a finite field k, the phase space V(f) has the 
form of a directed graph with vertices the elements of k n . There is an edge 
v — ► w in V(f) if /(v) = w. Since the graph has finitely many vertices 
it is easy to see that each component has the structure of a directed limit 
cycle, with trees (transients) feeding into each of the nodes in the cycle. 

Example 1.1. Let / : F§ — > F§ be given by f(x\, x?) = (1 — x\X2, 1 + 2x2). 
The phase space of / has two components, containing two limit cycles: one 
of length two and one of length three. See Figure Q] (right). The dependency 
relations among the variables are encoded in the dependency graph in Figure 
□ (left). 





Figure 1 . The dependency graph (left) and the phase space 
V(f) (right) of the finite dynamical system in Example 11.11 



In this case the information of interest is the number of components, the 
length of the limit cycles, and, possibly, the structure of the transient trees. 
It is also of interest to study the sequence of iterates f r of /. 

In recent years interest in polynomial dynamical systems over general 
finite fields has arisen also because they are useful as multistate general- 
izations of Boolean network models for biochemical networks, such as gene 
regulatory networks. In particular, the problem of network inference from 
experimental data can be formulated and solved in this framework. The 
tools used are very similar to those developed for the solution of problems 
in the statistical field of design of experiments and which led directly to 
the new field of algebraic statistics. (This connection is made explicit in 
[2l|] . ) Analyzing finite dynamical systems models of molecular networks 
requires tools that provide information about network dynamics from in- 
formation about the system. Since the systems arising in applications can 
be quite high-dimensional it is not feasible to carry out such an analysis by 
exhaustive enumeration of the phase space. Thus, the studies of polynomial 
dynamics mentioned earlier become relevant to this type of application. Fi- 
nally, an important role of models in biology and engineering is their use 
for the construction of controllers, be it for drug delivery to fight cancer or 
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control of an airfoil. Here too, approaches have been developed for control 
systems in the context of polynomial dynamical systems. 

Here we describe this circle of ideas and some existing results. Along the 
way we will mention open problems that arise. We believe that algebraic 
geometry can play an essential role as a conceptual and computational tool 
in this application area. Consistent with our specific research interests we 
will restrict ourselves to polynomial dynamical systems over finite fields. 



2. Finite dynamical systems 
Let A: be a finite field and let 

/ = (/!,...,/„) :fc n — fc n 
be a mapping. It is well-known that the coordinate functions fi can be 



represented as polynomials 24j, p. 369], and this representation is unique 
if we require that the degree of every variable in every term of the fi is 
less than \k\. If A; is the field with two elements, then it is easily seen that 
/ represents a Boolean network. Conversely, every Boolean network can 
be represented in polynomial form. Hence, polynomial dynamical systems 
over a finite field, which we shall call finite dynamical systems include all 
time-discrete dynamical systems over finite fields. Iteration of / generates 
dynamics which is captured by the phase space V(f) of /, defined above. We 
will first focus on the problem of inferring information about the structure 
of V(f) from the polynomials fi. 

In principle, a lot of information about V(f) can be gained by solving 
systems of polynomial equations. For instance, the fixed points of / are 
precisely the points on the variety given by the system 

fi(xi, . . . ,x n ) - xi = 0, . . . , fn(xi, ■ ■ ■ ,x n ) - x n = 0. 

The points of period 2 are the points on the variety given by the system 
fi(x±, . . . , x n ) — Xi = 0, i = 1, . . . , n, and so on. But often the most efficient 
way of solving these systems over a finite field, in particular a small finite 
field, is by exhaustive enumeration, which is not feasible for large n. A more 
modest goal would be to find out the number of periodic points of a given 
period, that is, the number of points on the corresponding variety. For finite 
fields this is also quite difficult without the availability of good general tools, 
and much work remains to be done in this direction. 

The case of linear systems is the only case that has so far been treated 
systematically, using methods from linear algebra, as one might expect. Let 
M be the matrix of /. Then complete information about the number of 
components in V(f), the lengths of all the limit cycles and the structure 
of the transient trees can be computed from the invariant factors of A, 
together with the orders of their irreducible factors [ll|. See 14] for an 



implementation of this algorithm in the computer algebra system Singular 
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There are results available for several other special families of systems, 
using ad hoc combinatorial and graph-theoretical methods. For instance, we 
have investigated the case where all the /, are monomials. It is of interest to 
characterize monomial systems all of whose periodic points are fixed points. 
For the field with two elements this characterization can be given in terms of 
the dependency graph of the system. This graph has as vertices the variables 
xi,...,x n . There is a directed edge X{ — » Xj if X{ appears in fj; see Figure 
(Deleft) for an example. Given a directed graph it can be decomposed into 
strongly connected components, with each component a subgraph in which 
each vertex can be reached from every other vertex by a directed path. 
Associated to a strongly connected graph we have its loop number which is 
defined as the greatest common divisor of the lengths of all directed loops 
in the graph based at a fixed vertex. (It is easy to see that this number is 
independent of the vertex chosen.) The loop number is also known as the 
index of imprimitivity of the graph. 

It is shown in 0] that the length of any cycle in the phase space of a 
monomial system / must divide the loop number of its dependency graph. In 
particular, we have the following result about fixed-point monomial systems. 

Theorem 2.1 (01). All periodic points of a monomial system f are fixed 
points if and only if each strongly connected component of the dependency 
graph of f has loop number 1. 

The study of monomial systems over general finite fields can be reduced 
to studying linear and Boolean systems [2| • There are other interesting fam- 
ilies of systems whose dynamics has been studied. One such family is that of 
Boolean networks constructed from nested canalyzing functions. These were 
introduced and studied in [l?]; (3j ■ The context is the use of Boolean net- 
works as models for gene regulatory networks initiated by S. Kauffman [ig| . 
We first recall the definitions of canalyzing and nested canalyzing functions 



from 171 ] . 



Definition 2.2. A canalyzing function is a Boolean function with the prop- 
erty that one of its inputs alone can determine the output value, for either 
"true" or "false" input. This input value is referred to as the canalyzing 
value, while the output value is the canalyzed value. 

Example 2.3. The function f(x,y) = xy is a canalyzing function in the 
variable x with canalyzing value and canalyzed value 0. However, the 
function f(x, y) = x + y is not canalyzing in either variable. 

Nested canalyzing functions are a natural specialization of canalyzing 
functions. They arise from the question of what happens when the function 
does not get the canalyzing value as input but instead has to rely on its 
other inputs. Throughout this paper, when we refer to a function of n 
variables, we mean that / depends on all n variables. That is, for 1 < i < n, 
there exists (a\, . . . ,a n ) £ F?? such that f(a±, . . . , Oj_i, aj, aj + i, . . . , a n ) / 
/(oi, . . . , aj_i, 1 + di, Oj+i, . . . , a n ). 
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Definition 2.4. A Boolean function / in n variables is a nested canalyzing 
function(NCF) in the variable order x\, x 2 , ■ ■ ■ , x n with canalyzing input 
values a±, . . . ,a n and canalyzed output values b±, . . . , b n , respectively, if it 
can be expressed in the form 



f(xi,x 2 , ■ ■ ■ 



h 
b 2 
h 



if x\ = oi, 

if x\ 7^ a\ and x 2 = a 2 , 

if x\ / a\ and x 2 7^ a 2 and X3 



03, 



if x\ 7^ a\ and • • • and x 
if x\ 7^ a\ and • • • and x 



-1 7^ a n- 
n T 1 ^n- 



and x n = a r 



Example 2.5. The function f(x,y,z) = x(y — l)z is nested canalyzing in 
the variable order x,y,z with canalyzing values 0,1,0 and canalyzed values 
0,0,0, respectively. However, the function f(x, y, z, w) = xy{z + w) is not a 
nested canalyzing function because if x 7^ and y 7^ 0, then the value of the 
function is not constant for any input values for either z or w. 



It is shown in 18| through extensive computer simulations that Boolean 
networks constructed from nested canalyzing functions show very stable 
dynamic behavior, with short transient trees and a small number of compo- 
nents in their phase space. It is this type of dynamics that gene regulatory 
networks are thought to exhibit. It is thus reasonable to use this class of 
functions preferentially in modeling such networks. To do so effectively it 
is necessary to have a better understanding of the properties of this class, 
for instance, how many nested canalyzing functions with a given number of 
variables there are. 



In 15J] a parametrization of this class is given as follows. The first step is 
to view Boolean functions as polynomials using the following translation: 



x A y = x ■ y, xVy = x + y + xy, 



x + 1. 



It is shown in [1 
the quotient ring R 



that the ring of Boolean functions is isomorphic to 



F 2 [x!,. . . ,x n ]/I, where / = (xj 



X j 



1 < i < n). 



Indexing monomials by the subsets of [n] := {1, . . . , n} corresponding to the 
variables appearing in the monomial, we can write the elements of it! as 



R 



{^2 csYlxi : c s e F 2 }. 



SC[n] 



As a vector space over F2, R is isomorphic to Fif via the correspondence 



(C0, ...,C[ n ]j G F 2 



SQn\ 



for a given fixed total ordering of all square- free monomials. That is, a 
polynomial function corresponds to the vector of coefficients of the monomial 
summands. The main result in 11511 is the identification of the set of nested 
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canalyzing functions in R with a subset V nc f of F| n by imposing relations 
on the coordinates of its elements. 

Definition 2.6. Let a be a permutation of the elements of the set [nj. We 
define a new order relation < CT on the elements of [n] as follows: a(i) < a a(j) 
if and only if i < j. Let be the maximum element of a nonempty subset 
S of [n] with respect to the order relation < a . For any nonempty subset 
S of [n], the completion of S with respect to the permutation a, denoted by 
[rf ], is the set [r§] = {<r(l), <t(2), . . . , a(r 5 )}. 

Note that, if cr is the identity permutation, then the completion is [r$] '■= 
{1,2, ... ,rs}, where is the largest element of S. 

Theorem 2.7. fTIl . Thm. 1] Let f 6 R and let a be a permutation of 
the set [n]. The polynomial f is a nested canalyzing function in the or- 
der x a (i\,x„n\, ... ,x a f n \, with input values a a a\ and corresponding output 
values b a r{\,l < i < n, if and only if c\ n ] = 1 and, for any proper subset 
S C [n], 

<r(i)e[r%]\S 

Corollary 2.8. FZ1 . Cor. 1] The set of points mF|™ corresponding to nested 
canalyzing functions in the variable order x^m , x a M\ > • • • i x a(n) > denoted by 
Va C ^ , is defined by 

V a Cf = {( c 0> ■ • • . c [n]) e Ff : c [n] = 1, c s = c [r a] C[ n ]\ {o .(j)}, /or 5 C [n]}. 

«r(*)£[r|]\S 



It was also shown in [15| that 

yncf _ ync/ ^ 

Counting the points on this variety for small values of n resulted in an inte- 
ger sequence, which, with the help of the On-Line Encyclopedia of Integer 
Sequences Qhttp : //www. research. att . com^nj a s/seque nces/ty led to the 
realization that the class of nested canalyzing functions is identical to the 
class of unate cascade functions that has been studied extensively in com- 
puter engineering literature. In particular, using this equality, one obtains 



a recursive formula for the number of nested canalyzing functions, see |15l . 
Corollary 2.11]. 

It is shown in that the sets V„ c ^ are the irreducible components of 
yncf _ p rec i se iy 5 it j s shown that for all permutations a on [n], the ideal of 

the variety , denoted by I a := I(V™ c f), is a binomial prime ideal in the 
polynomial ring F2[{c5 : S C [n]}], where F2 is the algebraic closure of F2. 

It remains to study this toric variety in more detail. In particular, a 
generalization of the concept of nested canalyzing function to larger finite 
fields remains to be worked out. Also, the approach taken here can be 
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applied to other classes of functions important in network modeling, such 
as threshold functions or monotone functions. 

3. Network inference 

Our motivation for much of the research described in the previous section 
comes from our work on one of the central problems in computational sys- 
tems biology. Due to the availability of so-called "omics" data sets it is now 
feasible to think about making large-scale mathematical models of molecular 
networks involving many gene transcripts (genomics), proteins (proteomics), 
and metabolites (metabolomics). One possible approach to this problem is 
to build a phenomenological model based solely or largely on the experi- 
mental data which can subsequently be refined with additional biological 
information about the mechanisms of interaction of the different molecular 
species. That is, given a data set, we are to infer a "most likely" math- 
ematical or statistical model of the network that generated this data set. 
The biggest challenge is that typically the network that generated the data 
is high-dimensional (hundreds or thousands of molecular species/variables) 
and the available data sets are typically very small (tens or hundreds of data 
points). Also, general properties of such networks are not well-understood 
so that there are few general selection criteria. Therefore, it is not feasible 
to apply many of the existing network inference methods. In this context, 
two pieces of information about a molecular network are of interest to a life 
scientist: the "wiring diagram" of the network indicating which variables 
causally affect which others, and the long-term dynamic behavior of the 
network. Some network inference methods give only information about the 
wiring diagram, others provide both. 

One possible modeling framework is that of polynomial dynamical systems 
over finite fields. In the 1960s, S. Kauffman proposed Boolean networks as 
good models for capturing key aspects of gene regulation 11911 . In the 1990s 
so-called logical models were proposed by R. Thomas [29(] as models for 
biochemical network, which are multi-state time-discrete dynamical systems, 
with both deterministic and stochastic variants. In [13] it was shown that 
the modeling framework of polynomial dynamical systems over finite fields 
is a good setting for the problem of network inference. As pointed out, these 
systems generalize Boolean networks and have many of the same features 
as logical models. The network inference problem can be formulated in this 
setting as follows. 

Suppose that the biological system to be modeled contains n variables, 
e.g., genes, and we measure r + 1 time points po, . . . ,p r , using, e.g., gene 
chip technology, each of which can be viewed as an ra-dimensional real- 
valued vector. The first step is to discretize the entries in the Pi into a 
prime number of states, which are viewed as entries in a finite field k. If 
we choose to discretize into two states by choosing a threshold, then we 
will obtain Boolean networks as models. The discretization step is crucial 
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in this process as it represents the interface between the continuous and 
discrete worlds. Other network inference methods, such as most dynamic 
Bayesian network methods, also have to carry out this preprocessing step. 
Unfortunately, there is very little work that has been done on this problem. 
We have developed a new discretization method which is described in 0]. It 
compares favorably to other commonly used discretization methods, using 
different network inference methods. 

Given this data set, an admissible model 

f = (fl,f2,...,fn)-k n ^k n 

consists of a dynamical system / which satisfies the property that 

f(Pj) = (/i(Pi). • • • > fn{Pj)) = Pj+l- 

The algorithm in [2(J then proceeds to select such a model /, which is 
the most likely one based on certain specified criteria. This is done by first 
reducing the problem to the case of one variable, that is, to the problem of 
selecting the fi separately. For this purpose, we compute the set of all func- 
tions fi such that fi(pj) = Pj+i) that is, all polynomial functions fi whose 
value on pj is the ith coordinate of Pj+i- This set can be represented as the 
coset /° + I, where /° is a particular such function and / C k[x\, . . . , x n ] is 
the ideal of all polynomials that vanish on the given data set, also known 
as the ideal of points of pi, . . . , p r -i- The algorithm then chooses the nor- 
mal form of an interpolating polynomial f°, based on a chosen term order. 
One drawback of the algorithm is that this choice of term order is typically 
random and can of course significantly affect the form of the model. 

Modifications of the algorithm in [2(J have been constructed. The al- 
gorithm in [liil ] starts with only data as input and computes all possible 
minimal wiring diagrams of polynomial models that fit the given data and 
outputs a most likely one, based on one of several possible model scoring 
methods. It does not depend on the choice of a term order. The algorithm 
is based on the observation that if /, is the i-th. coordinate function of a 
model and p, q are data points such that /j(p) 7^ /j(q), then the function fi 
must involve at least some of the variables corresponding to coordinates in 
which p and q differ. This observation can be encoded in a monomial ideal 
M which is generated by all monomials of the form 

n x j 

Pj¥=<D 

for all pairs of points p / q such that /i(p) 7^ /i(q). Now let P = 
(xj 1} ... ,Xj t ) be a minimal prime of M. Then it is not hard to see that 
the generators of P induce a minimal wiring diagram for /j. Conversely, 
every such minimal diagram provides generators for a minimal prime of the 
ideal M. This algorithm has been implemented in Macaulay2 0]. The al- 
gorithm comes with a collection of probability distributions on the set of 
minimal primes that can be used for model selection. 
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Another approach to the problem of dependency of the model selection 
process in [2Q] on the chosen term order is taken in [f| . The algorithm there 
uses the Grobner fan of the ideal of points as a computational tool to find a 
most likely wiring diagram. It is clear that an upper bound for the number 
of different models one can obtain from the algorithm in [2(3] by varying the 
choice of term order is given by the number of cones in the Grobner fan. 
The algorithm in 0] uses information about the frequency of appearance 
of the different variables in models built for each cone to build a consensus 
wiring diagram from this collection of possible models. It computes the 
Deegan-Packel index of power [4] to rank variables in order of significance. 
This index was introduced in [7[, where it was computed using a Monte 
Carlo algorithm to generate random term orders. The use of the Grobner 
fan allows a systematic computation of this index. 

Note that the model space f + 1 contains all possible polynomial func- 
tions that fit the given data. In order to improve the performance of model 
selection algorithms it would be very useful to be able to select certain sub- 
spaces of functions that have favorable properties as models of particular 
biological systems, thereby reducing the model space. For instance, one 
might consider imposing certain constraints on the structure of the poly- 
nomials or on the resulting dynamics. The desire to find such constraints 
is what motivated the investigations described in the previous section. In 
order to select a class of polynomials with prescribed dynamics one needs 
to be able to link polynomial structure to dynamics in an algorithmic way. 
Similarly, in order to limit model selection to special classes of polynomials, 
such as nested canalyzing functions in the Boolean case, one must be able 
to identify efficiently the set of such functions from the model space f° + I. 
This problem remains open. 



4. Control of finite dynamical systems 

Control of biological systems is an important aspect of computational 
biology, ranging from the control of intracellular biochemical pathways to 
chemotherapy drug delivery and control of epidemiological processes. In 
order to apply mathematical control theory techniques it is necessary to work 
with a mathematical model of the system for which control theoretic tools 
exist. There is of course a very rich control-theoretic literature for systems 
of differential equations. However, the problem has also been considered in 
the context of polynomial dynamical systems over finite fields [HI; Hi); H^j. 
We briefly describe the general setting. 

The framework developed before needs to be slightly modified to accom- 
modate variables representing control inputs at each state transitions as well 
as constraints on these inputs and on the set of allowable initial conditions. 

Definition 4.1. A controlled finite dynamical system is a function 

F:k n xk m — > k n , 
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where the first set of variables x±, . . . ,x n represents the state variables, and 
the second set u\, . . . , u m represents the control variables. Furthermore, we 
have a system of polynomial equations 

Qi (2-1 ; . . . j x n , U\ , . . . , ii m ) — 0, i — l,...,r, 

which defines the variety of admissible control inputs, and another system 

Pj(xi,...,x n ) =0, j = 1, ...,s, 

which defines the variety of admissible initial conditions of the system. Fi- 
nally, we have another polynomial system 

U a (xi, . . . ,x n ) = 0, a = 1, . . . ,t, 

which defines the variety of admissible final states. 

A typical optimal control problem is then stated as follows. Given a con- 
trolled system F and an admissible initialization {x\, . . . ,x n ), find a sequence 
of control inputs which drive the system to an admissible final state, in such 
a way that a suitably defined cost function is minimized. There are several 
ad hoc strategies of finding an optimal controller but much theoretical work 
remains to be done in this context. 

In an application of this approach to a virus competition problem 
was given, which we describe here in some detail in order to illustrate the 
definitions. In a Petri dish the center is infected with two different suitably 
chosen virus strains. It can be observed experimentally that, as the infec- 
tion spreads to the rest of the dish a clear pattern of segmentation occurs 
between the two virus populations, rather than the expected mixing of the 
two strains. Through addition of one or the other type of virus over time 
the segmentation pattern can be influenced. For instance, it is possible to 
contain one virus strain by strategically inoculating cells in the Petri dish 
with the other strain. A possible application of this observation might be 
that one can contain the spread of a very harmful virus by the strategic 
introduction of another, less harmful virus strain. In this context it would 
be of interest to develop optimal strategies for introducing the second strain. 

This problem was treated in by representing the spread of the two 
virus populations as a polynomial dynamical system over the Galois field 
k = GF(4) as follows. The Petri dish is represented by 331 concentrically 
arranged hexagonal cells, each of which is a variable of the system. Each cell 
can take on 4 possible states, corresponding to being uninfected (White), 
infected by one of the two strains (Green or Red), or infected by both (Yel- 
low). We begin with an arbitrary initialization of the center of the Petri 
dish, which is represented by the 19 innermost cells. That is, we make an 
arbitrary assignment of the 4 colors to these cells. All remaining cells are 
assigned White. The goal is to apply a series of control inputs as the in- 
fection spreads, which prevents the Red virus from spreading to the edge 
of the Petri dish. That is, a desirable final state of the system is any state 



ON THE ALGEBRAIC GEOMETRY OF POLYNOMIAL DYNAMICAL SYSTEMS 11 

for which the outermost ring of cells is infected only with Green virus. See 
Figure [21 




Figure 2. Control objective. 



The rules governing the spread of infection are as follows: 

(1) If a cell has only one infected neighbor, then it will get the same 
type of infection. 

(2) If a cell has two infected neighbors, then one makes an assignment 
for the different possibilities (for details see 

We use the representation k = GF(4) := Z2[a] = {0, a, a 2 , a 3 }. The color 
assignment is as follows: 



Color 


Field Element 


Green 





White 


a 


Yellow 


a 2 


Red 


a 3 = 1 



If we represent the 331 cells by the variables xi, . . . , X331, with x±, . . . , x\g 
representing the center cells and X271 , . . . , £331 the cells in the outermost 
ring, then the variety U of admissible initializations of the model can be 
described as 

VO20 - a, ... , x 33 i - a) = V(l - (1 - (x 20 - a) 3 ) • • • (1 - (x 331 - a) 3 )). 

(Recall that in k = GF(4), we have 6 3 = 1 for any nonzero b S GF(A).) 

As explained, at any point in the simulation, the cells in the outer ring 
should be either green or white. Thus, we can describe the constraint variety 
V as follows. We have 

V = {x € k 331 \xi(xi -a) = 0;271 < i < 331} 
331 

= V(l- J] (l-^-axi) 3 )). 
t=271 
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The next step is to construct a state space model 

/ = (/i,...,/33i):A ; 331 ^A; 331 , 

of this experimental system, that is, a polynomial dynamical system / that 
approximates the dynamics observed in the laboratory. The coordinate func- 
tions fi are polynomials in k[x±, . . . , 2331] and represent the update rules of 
the cells X{. Since the simulated Petri dish is homogeneous, all cells are 
identical, so that /, = fj for all This is done using the algorithm in [2o| . 

Let x represent one of the 331 cells, and let y±, . . . ,y$ represent its six 
immediate neighbors. We compute a symmetric polynomial h G k[yi, . . . , yo\ 
which represents the rules for the spread of the infection. That is, for a 
given infection status of the neighboring six cells, the polynomial takes on 
the appropriate value in GF(4). At the same time, we compute the ideal of 
these points /, that is, the set of all polynomial functions in k[yi, . . . , ye] that 
vanish at these points. Since we are interested only in symmetric functions, 
let I be the ideal of symmetric functions inside the ideal /. The ideal I 
can easily be computed using computational algebra methods. Thus any 
possible symmetric polynomial function that can be a model of our system 
must be in the set h + 1. We choose as a model the normal form of h in the 
ideal /, with respect to a chosen term order. In our case, 

/ = ll + 727? + A? + A? + a 2 7i 

where 

71 = 2/1 H 1-2/6, and 

72 = ^ViVj- 

The polynomials 71 and 72 are elementary symmetric functions in the poly- 
nomial ring k[y±, . . . , y§\. Thus, the polynomial dynamical system / : A; 331 — > 
A; 331 is a state space model for our system. 

The next step in formulating the optimal control problem is to define a 
cost function for a given controller g : GF(4) 331 —> GF(4) 33 . We assume 
a uniform unit cost c for each cell that the controller infects with GREEN 
virus. Furthermore, assume that there is an "overhead" cost d attached to 
each intervention, independent of the number of cells transformed. Then the 
cost function for a controller g is given as follows. Suppose that {u\, . . . , u r } 
is a sequence of control inputs. Let q be the number of cells infected with 
GREEN during control input itj. Then 

r 

C (9, i u i}) = ^2( c ■ c i + \ s <h,o - !| • d). 
i=l 

To find a controller g that minimizes the cost function C(g) amounts to 
finding a way to control the system by transforming a minimal number of 
cells. 
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The paper |16j] contains the construction and implementation of a suitable 
controller that was experimentally verified to accomplish the stated goal of 
containing one of the two strains. The authors were not able to show that 
the chosen control strategy was optimal, however. 

It is well-known that optimal control of non-linear systems is difficult and 
few general tools are available. A common ad hoc approach is to "work 
backwards" from a desired end state and reconstruct control inputs in this 
way. One might hope that a suitable formulation of the problem in the 
language of algebraic geometry might allow the use of new tools from this 
field. 



5. Discussion 

It was shown in this paper that several problems about polynomial dy- 
namical systems over finite fields can be formulated and possibly solved in 
the context of algebraic geometry and computational algebra. In particu- 
lar, the problem of relating the structure of the defining polynomials to the 
resulting dynamics can be approached in this way. Likewise, the inference 
of a system from a given partial data set is amenable to a solution using 
tools from algebraic geometry. Finally, the beginnings of a control theory 
for such systems is expressed in the language of ideals and varieties. 

However, it is clear that the results presented here barely scratch the 
surface of the problems and of what could be accomplished with the use of 
more sophisticated algebraic geometric tools. 
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